DETAILED ACTION



Specification 



1. The disclosure is objected to because of the following informalities: The phrase
"acoustic signals" is a confusing phrase (page 3, ln.4 and 9) because all audio signals 
are "acoustic signals". 

Appropriate correction is required. 



Claim Rejections - 35 USC § 102

2. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action:
A person shall be entitled to a patent unless -

(e) the invention was described in (1) an application for patent, published under section 122(b), by
another filed in the United States before the invention by the applicant for patent or (2) a patent
granted on an application for patent by another filed in the United States before the invention by the
applicant for patent, except that an international application filed under the treaty defined in section
351(a) shall have the effects for purposes of this subsection of an application filed in the United States
only if the international application designated the United States and was published under Article 21(2)
of such treaty in the English language.



3. Claims 1-2, 4, 9-10, 12, 17-18, 20, 25-26, 28, 33-34, 36, 42-43, 45 are rejected
under 35 U.S.C. 102(e) as being anticipated by Takahashi et al. (U.S. Patent No.
6,347,185).

Referring to claims 1, 9, 17 and 25, Takahashi et al. disclose a method and an
apparatus for classifying signals and generating descriptors comprising:

dividing an input signal into blocks having a predetermined time length (col.4,
ln.21-23); 

extracting one or more characteristic quantities (e.g. frequency value, power 
value, peak frequency) of a signal attribute from the signal of each block (col.5, ln.49- 
65); and 

classifying the signal of each block into a category (e.g. mute, music, human 
speech) according to the characteristic quantities (e.g. frequency value, power value, 
peak frequency; col.5, ln.66 - col.6, ln.2 and ln.36-54). 
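For illustration only, the three recited steps (block division, extraction of characteristic quantities, categorical classification) can be sketched as follows. This is a minimal sketch and not the method of Takahashi et al. or of the application: the zero-crossing rate stands in for a frequency measure, and the function names, the thresholds `power_floor` and `zcr_split`, and the mapping of features to "music" or "speech" are all assumptions made for the example.

```python
import math

def split_into_blocks(samples, block_len):
    """Divide a signal into consecutive blocks of a fixed number of samples.
    A trailing partial block is dropped."""
    return [samples[i:i + block_len]
            for i in range(0, len(samples) - block_len + 1, block_len)]

def block_features(block):
    """Return (mean power, zero-crossing rate) for one block."""
    power = sum(x * x for x in block) / len(block)
    crossings = sum(1 for a, b in zip(block, block[1:]) if (a < 0) != (b < 0))
    return power, crossings / (len(block) - 1)

def classify_block(power, zcr, power_floor=1e-4, zcr_split=0.1):
    """Toy decision rule; both thresholds are illustrative assumptions."""
    if power < power_floor:
        return "mute"
    return "speech" if zcr > zcr_split else "music"

def classify_signal(samples, block_len):
    """Classify every block of the signal and return the list of categories."""
    return [classify_block(*block_features(b))
            for b in split_into_blocks(samples, block_len)]
```

Calling `classify_signal(samples, block_len)` yields one category label per block, which is the form of result the later retrieval claims operate on.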

Referring to claims 2, 10, 18 and 26, Takahashi et al. disclose the method and
the apparatus for classifying signals and generating descriptors, wherein the signal of
each block is classified into any of the categories formed on the basis of types of signal
sources (col.2, ln.5-11 and col.3, ln.18-21).

Referring to claims 4, 12, 20 and 28, Takahashi et al. disclose the method and
the apparatus for classifying signals and generating descriptors, wherein the input
signal is an audio signal (col.3, ln.46-47); and

the categories formed on the basis of signal sources for classifying the audio 
signal of each block includes one or more than one of silence, voice, male voice, female 
voice, music, vocal music, instrumental music, noise, striking sound, environmental 
sound, sound of hustle and bustle, clapping sound and cheering sound and are used for 
the categorical classification based on the sound sources (col.4, ln.28-31). 
Referring to claims 33 and 42, Takahashi et al. disclose a method and an
apparatus for retrieving signals comprising:

dividing an input signal into blocks having a predetermined time length (col.4,
ln.21-23);

extracting one or more characteristic quantities (e.g. frequency value, power
value, peak frequency) of a signal attribute from the signal of each block (col.5, ln.49-
65);

classifying the signal of each block into a category (e.g. mute, music, human
speech) according to the characteristic quantities (e.g. frequency value, power value,
peak frequency; col.5, ln.66 - col.6, ln.2 and ln.36-54); and

retrieving the signal according to the result of categorical classification or by 
using a descriptor (N See's Type) generated according to the category of classification 
(col.9, ln.11-42). 
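For illustration only, retrieval by a descriptor derived from the per-block categories can be sketched as follows. This is a minimal sketch under stated assumptions: the descriptor layout `(category, start, end)` and the names `labels_to_segments` and `retrieve` are hypothetical and are not taken from Takahashi et al. or from the application.

```python
from itertools import groupby

def labels_to_segments(labels, block_seconds):
    """Merge runs of identical block labels into (category, start, end) descriptors."""
    segments, index = [], 0
    for category, run in groupby(labels):
        length = len(list(run))
        segments.append((category,
                         index * block_seconds,
                         (index + length) * block_seconds))
        index += length
    return segments

def retrieve(segments, category):
    """Return the (start, end) time ranges whose descriptor matches the category."""
    return [(start, end) for cat, start, end in segments if cat == category]
```

In this sketch the boundaries between consecutive segments are also the points at which the category of the signal changes.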

Referring to claims 34 and 43, Takahashi et al. disclose the method and the
apparatus for retrieving signals, wherein the signal of each block is classified into any of
the categories formed on the basis of types of signal sources (col.2, ln.5-11 and col.3,
ln.18-21).
Referring to claims 36 and 45, Takahashi et al. disclose the method and the
apparatus for retrieving signals, wherein the input signal is an audio signal (col.3, ln.46-47); and

the categories formed on the basis of signal sources for classifying the audio
signal of each block includes one or more than one of silence, voice, male voice, female
voice, music, vocal music, instrumental music, noise, striking sound, environmental
sound, sound of hustle and bustle, clapping sound and cheering sound and are used for
the categorical classification based on the sound sources (col.4, ln.28-31); and

a signal is retrieved by using the descriptor reflecting or corresponding to the
result of the categorical classification based on the sound sources (col.9, ln.11-42).



Claim Rejections - 35 USC § 103

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set
forth in section 102 of this title, if the differences between the subject matter sought to be patented and
the prior art are such that the subject matter as a whole would have been obvious at the time the
invention was made to a person having ordinary skill in the art to which said subject matter pertains.
Patentability shall not be negatived by the manner in which the invention was made.



5. Claims 3, 5, 11, 13, 19, 21, 27, 29, 35, 37, 41, 44, 46 and 50 are rejected under
35 U.S.C. 103(a) as being unpatentable over Takahashi et al. in view of Lindemann
(U.S. Patent No. 6,316,710).

Referring to claims 3, 11, 19 and 27, Takahashi et al. do not specifically disclose
a method and an apparatus for classifying signals and generating descriptors, wherein
the signal of each block is classified into any of the categories formed on the basis of
types of structures that signals may have and do not depend on the types of signal
sources.

However, Lindemann teaches a method and an apparatus for classifying and
generating descriptors, wherein the signal of each block is classified into any of the categories
formed on the basis of types of structures (based on noise, silence and musical gesture
types) that signals may have and do not depend on the types of signal sources (Fig.2
and col.4, ln.17-35). The advantage of using the teaching of Lindemann in Takahashi et
al. is to determine the characteristics of the sound unit (col.3, ln.60-61).

Therefore, it would have been obvious to one having ordinary skill in the art at
the time of invention to modify the method or the apparatus of Takahashi et al. to be
able to classify the signal of each block into any of the categories formed on the basis of
types of structures that signals may have and do not depend on the types of signal
sources in order to provide smooth transitions for slurs by artificially manipulating the
data associated with isolated sound recordings, as taught by Lindemann (col.2, ln.44-
46).

Referring to claims 5, 13, 21 and 29, Takahashi et al. disclose the method and 
the apparatus for classifying signals and generating descriptors, wherein the input 
signal is an audio signal (col.3, ln.46-47). 

Takahashi et al. do not specifically disclose a method and an apparatus for 
classifying signals and generating descriptors, wherein 
the categories formed on the basis of structures that signals may have and do not
depend on the types of signal sources for classifying the audio signal of each block 
includes a silence structure where no significant sound exists in the block. 

However, Lindemann teaches a method and an apparatus for classifying signals
and generating descriptors, wherein the audio signal of each block includes a silence
structure where no significant sound exists in the block (col.4, ln.33 and Fig.1, elements
#110, #114 and #118). The advantage of using the teaching of Lindemann in Takahashi
et al. is to determine the characteristics of the sound unit (col.3, ln.60-61).

Therefore, it would have been obvious to one having ordinary skill in the art at
the time of invention to modify the method or the apparatus of Takahashi et al. to be
able to classify the signal of each block into any of the categories formed on the basis of
types of structures in order to provide smooth transitions for slurs by artificially
manipulating the data associated with isolated sound recordings, as taught by
Lindemann (col.2, ln.44-46).

Referring to claims 35 and 44, Takahashi et al. do not specifically disclose a
method and an apparatus for retrieving signals, wherein the signal of each block is
classified into any of the categories formed on the basis of types of structures that signals
may have and do not depend on the types of signal sources.

However, Lindemann teaches a method and an apparatus for retrieving signals,
wherein the signal of each block is classified into any of the categories formed on the
basis of types of structures (based on noise, silence and musical gesture types) that signals
may have and do not depend on the types of signal sources (Fig.2 and col.4, ln.17-35).
The advantage of using the teaching of Lindemann in Takahashi et al. is to determine
the characteristics of the sound unit (col.3, ln.60-61).

Therefore, it would have been obvious to one having ordinary skill in the art at
the time of invention to modify the method or the apparatus of Takahashi et al. to be
able to classify the signal of each block into any of the categories formed on the basis of
types of structures that signals may have and do not depend on the types of signal
sources in order to provide smooth transitions for slurs by artificially manipulating the
data associated with isolated sound recordings, as taught by Lindemann (col.2, ln.44-
46).



Referring to claims 37 and 46, Takahashi et al. disclose the method and the 
apparatus for retrieving signals, wherein the input signal is an audio signal (col.3, ln.46- 
47); and 

a signal is retrieved by using the descriptor reflecting or corresponding to the 
result of the categorical classification based on the sound sources (col.9, ln.11-42).

Takahashi et al. do not specifically disclose a method and an apparatus for 
retrieving signals, wherein the categories formed on the basis of structures that signals 
may have and do not depend on the types of signal sources for classifying the audio 
signal of each block includes one or more of a silence structure where no significant 
sound exists in the block. 
However, Lindemann teaches a method and an apparatus for retrieving signals,
wherein the audio signal of each block includes one or more of a silence structure
where no significant sound exists in the block (col.4, ln.33 and Fig.1, elements #110,
#114 and #118). The advantage of using the teaching of Lindemann in Takahashi et al.
is to determine the characteristics of the sound unit (col.3, ln.60-61).

Therefore, it would have been obvious to one having ordinary skill in the art at
the time of invention to modify the method or the apparatus of Takahashi et al. to be
able to classify the signal of each block into any of the categories formed on the basis of
types of structures in order to provide smooth transitions for slurs by artificially
manipulating the data associated with isolated sound recordings, as taught by
Lindemann (col.2, ln.44-46).



Referring to claims 41 and 50, Takahashi et al. do not specifically disclose a 
method and an apparatus for retrieving signals, wherein the points of changes of the 
signal are detected by using the descriptor reflecting or corresponding to the result of 
the categorical classification. 

However, Lindemann teaches a method and an apparatus for retrieving signals,
wherein points of changes of the signal are detected by using the descriptor reflecting or
corresponding to the result of the categorical classification (e.g. attack, release,
transition, sustain and silence; col.4, ln.17-35; col.5, ln.39-42 and col.9, ln.26-31). The
advantage of using the teaching of Lindemann in Takahashi et al. is to determine the
characteristics of the sound unit (col.3, ln.60-61).

Therefore, it would have been obvious to one having ordinary skill in the art at
the time of invention to modify the method or the apparatus of Takahashi et al. to be
able to determine the points of changes of the signal in order to provide smooth
transitions for slurs by artificially manipulating the data associated with isolated sound
recordings, as taught by Lindemann (col.2, ln.44-46).

6. Claims 8, 16, 24, 32, 40 and 49 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Takahashi et al. in view of Wu et al. (U.S. Patent No. 6,006,179). 

Referring to claims 8, 16, 24, 32, 40 and 49, Takahashi et al. do not specifically
disclose a method and an apparatus for classifying signals, generating descriptors and
retrieving signals, wherein a vector quantization technique is used as a method for the
categorical classification.

However, Wu et al. teach a method and an apparatus for classifying signals,
generating descriptors and retrieving signals, wherein a vector quantization technique is
used as a method for the categorical classification (col J, ln.39-42 and 50-52). The
advantage of using the teaching of Wu et al. in Takahashi et al. is to search for the best
possible quantization for a given input vector (col.4, ln.25-27).

Therefore, it would have been obvious to one having ordinary skill in the art at
the time of invention to modify the method or the apparatus of Takahashi et al. so
that a vector quantization technique is used as a method for the categorical classification
in order to reduce computational complexity and natural compatibility, as taught by Wu
et al. (col.5, ln.65-66). 
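For illustration only, categorical classification by vector quantization can be sketched as a nearest-codeword search over a labelled codebook. This is a minimal sketch and not the technique of Wu et al.: the function name `nearest_codeword`, the dictionary codebook layout and the squared Euclidean distance are assumptions made for the example.

```python
def nearest_codeword(vector, codebook):
    """Return the label of the codeword closest to `vector`.

    `codebook` maps a category label to a representative feature vector
    (a hypothetical layout chosen for this sketch).
    """
    def dist2(a, b):
        # Squared Euclidean distance between two equal-length vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(codebook, key=lambda label: dist2(vector, codebook[label]))
```

Each block's vector of characteristic quantities would be passed as `vector`, and the returned label serves as the block's category.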

7. Claims 6-7, 14-15, 22-23, 30-31, 38-39 and 47-48 are rejected under 35 U.S.C.
103(a) as being unpatentable over Takahashi et al. in view of Pertrushin (U.S. Patent 
No. 6,151,571). 

Referring to claims 6, 14, 22, 30, 38 and 47, Takahashi et al. do not specifically 
disclose a method and apparatus for classifying signals, generating descriptors and 
retrieving signals, wherein the average and variances of the signal power in the block 
are used as the characteristic quantities. 

However, Pertrushin teaches a method and apparatus for classifying signals, 
generating descriptors and retrieving signals, wherein the average power (energy, 
col.12, ln.56) and variance (energy standard deviation is square root of variance, col.12,
ln.62-64 and col.13, ln.9) of the signal in the block are used as the characteristic
quantities. 
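For illustration only, the average and variance of the signal power in a block can be sketched as follows. This is a minimal sketch under stated assumptions: the block is split into short frames of `frame_len` samples (a hypothetical parameter), the power of each frame is its mean squared amplitude, and the population variance is used. The standard deviation referred to above is the square root of the returned variance.

```python
from statistics import fmean, pvariance

def power_statistics(block, frame_len):
    """Return (average power, variance of power) over the frames of one block."""
    frames = [block[i:i + frame_len]
              for i in range(0, len(block) - frame_len + 1, frame_len)]
    # Power of a frame is taken as its mean squared amplitude.
    powers = [fmean(x * x for x in frame) for frame in frames]
    return fmean(powers), pvariance(powers)
```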

Therefore, it would have been obvious to one having ordinary skill in the art at
the time of invention to modify the method of Takahashi et al. to be able to classify the
average power and the variance of the voice signal to allow detection of an emotion of a
caller in order to provide a quick and effective response when the caller's emotional
state is in distress, as taught by Pertrushin (col.22, ln.1-4).
Referring to claims 7, 15, 23, 31, 39 and 48, Takahashi et al. do not specifically
disclose a method and apparatus for classifying signals, generating descriptors and 
retrieving signals, wherein the average harmonic energy is a temporal average of the 
ratio of the energy of the sound component of the integer times of pitch frequency to the 
energy of all the frequencies, and the standard deviation energy is a temporal standard 
deviation of the ratio of the energy of the sound component of the integer times of pitch 
frequency to the energy of all the frequencies. 

However, Pertrushin teaches a method and apparatus for classifying signals, 
generating descriptors and retrieving signals, wherein 

the average harmonic energy (relative voiced energy) is a temporal average of 
the ratio of the energy of the sound component of the integer times of pitch frequency to 
the energy of all the frequencies (col.12, ln.62-col.13, ln.2), and 

the standard deviation energy (energy standard deviation) is a temporal standard 
deviation of the ratio of the energy of the sound component of the integer times of pitch 
frequency to the energy of all the frequencies (col.13, ln.9).

Therefore, it would have been obvious to one having ordinary skill in the art at
the time of invention to modify the method of Takahashi et al. to be able to classify the
average harmonic energy and energy standard deviation of the voice signal to allow
detection of an emotion of a caller in order to provide a quick and effective response when
the caller's emotional state is in distress, as taught by Pertrushin (col.22, ln.1-4).
Conclusion



8. The prior art made of record and not relied upon is considered pertinent to
applicant's disclosure. Imai et al. (U.S. Patent No. 6,236,970) teach a speech-rate
converter that, while slowing down input speech, regularly monitors the data length of the
input speech and the previously estimated extended output data length for the current
rate scaling factor, computing a new output data estimate.

Any inquiry concerning this communication or earlier communications from the
examiner should be directed to the examiner, Vincent V. Tran, whose e-mail address is

Vincent.tran@USPTO.GOV.

Phone number: (703) 305-1817

If attempts to reach the examiner by telephone are unsuccessful, the examiner's
supervisor, Mr. Talivaldis Ivars Smits, can be reached on (703) 306-3011.
Any inquiry of a general nature or relating to the status of this application should be
directed to the Technology Center 2600 receptionist whose telephone number is (703)
305-4700.

9. Any response to this action should be mailed to: 
Commissioner of Patents and Trademarks 
P.O. Box 1450 

Alexandria, VA 22313-1450 
Or faxed to: (703) 872-9314 
Hand-delivered responses should be brought to Crystal Park II, 2121 Crystal Dr,
Arlington VA, Sixth Floor (Receptionist, Tel. No. 703-305-4700). 

Art Unit 2655 
VINCENT V. TRAN